<!DOCTYPE html>
<html lang="en">
<head>
	<meta charset="UTF-8">
	<meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
	<title>Common grams token filter | ElasticSearch 7.7 权威指南中文版</title>
	<meta name="keywords" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <meta name="description" content="ElasticSearch 权威指南中文版, elasticsearch 7, es7, 实时数据分析，实时数据检索" />
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
	<link rel="stylesheet" type="text/css" href="../static/styles.css" />
	<script>
	var _link = 'analysis-common-grams-tokenfilter.html';
    </script>
</head>
<body>
<div class="main-container">
    <section id="content">
        <div class="content-wrapper">
            <section id="guide" lang="zh_cn">
                <div class="container">
                    <div class="row">
                        <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                            <div style="color:gray; word-break: break-all; font-size:12px;">原英文版地址: <a href="https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analysis-common-grams-tokenfilter.html" rel="nofollow" target="_blank">https://www.elastic.co/guide/en/elasticsearch/reference/7.7/analysis-common-grams-tokenfilter.html</a>, 原文档版权归 www.elastic.co 所有<br/>本地英文版地址: <a href="../en/analysis-common-grams-tokenfilter.html" rel="nofollow" target="_blank">../en/analysis-common-grams-tokenfilter.html</a></div>
                        <!-- start body -->
                  <div class="page_header">
<strong>重要</strong>: 此版本不会发布额外的bug修复或文档更新。最新信息请参考 <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html" rel="nofollow">当前版本文档</a>。
</div>
<div id="content">
<div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.html">Elasticsearch Guide [7.7]</a></span>
»
<span class="breadcrumb-link"><a href="analysis.html">Text analysis</a></span>
»
<span class="breadcrumb-link"><a href="analysis-tokenfilters.html">Token filter reference</a></span>
»
<span class="breadcrumb-node">Common grams token filter</span>
</div>
<div class="navheader">
<span class="prev">
<a href="analysis-classic-tokenfilter.html">« Classic token filter</a>
</span>
<span class="next">
<a href="analysis-condition-tokenfilter.html">Conditional token filter »</a>
</span>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="analysis-common-grams-tokenfilter"></a>Common grams token filter<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenfilters/common-grams-tokenfilter.asciidoc">edit</a>
</h2>
</div></div></div>

<p>Generates <a href="https://en.wikipedia.org/wiki/Bigram" class="ulink" target="_top">bigrams</a> for a specified set of
common words.</p>
<p>For example, you can specify <code class="literal">is</code> and <code class="literal">the</code> as common words. This filter then
converts the tokens <code class="literal">[the, quick, fox, is, brown]</code> to <code class="literal">[the, the_quick, quick,
fox, fox_is, is, is_brown, brown]</code>.</p>
<p>You can use the <code class="literal">common_grams</code> filter in place of the
<a class="xref" href="analysis-stop-tokenfilter.html" title="Stop token filter">stop token filter</a> when you don’t want to
completely ignore common words.</p>
<p>This filter uses Lucene’s
<a href="https://lucene.apache.org/core/8_5_1/analyzers-common/org/apache/lucene/analysis/commongrams/CommonGramsFilter.html" class="ulink" target="_top">CommonGramsFilter</a>.</p>
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analysis-common-grams-analyze-ex"></a>Example<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenfilters/common-grams-tokenfilter.asciidoc">edit</a>
</h3>
</div></div></div>
<p>The following <a class="xref" href="indices-analyze.html" title="Analyze API">analyze API</a> request creates bigrams for <code class="literal">is</code>
and <code class="literal">the</code>:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">GET /_analyze
{
  "tokenizer" : "whitespace",
  "filter" : [
    {
      "type": "common_grams",
      "common_words": ["is", "the"]
    }
  ],
  "text" : "the quick fox is brown"
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/904.console"></div>
<p>The filter produces the following tokens:</p>
<div class="pre_wrapper lang-text">
<pre class="programlisting prettyprint lang-text">[ the, the_quick, quick, fox, fox_is, is, is_brown, brown ]</pre>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analysis-common-grams-tokenfilter-analyzer-ex"></a>Add to an analyzer<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenfilters/common-grams-tokenfilter.asciidoc">edit</a>
</h3>
</div></div></div>
<p>The following <a class="xref" href="indices-create-index.html" title="Create index API">create index API</a> request uses the
<code class="literal">common_grams</code> filter to configure a new
<a class="xref" href="analysis-custom-analyzer.html" title="Create a custom analyzer">custom analyzer</a>:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT /common_grams_example
{
    "settings": {
        "analysis": {
            "analyzer": {
              "index_grams": {
                  "tokenizer": "whitespace",
                  "filter": ["common_grams"]
              }
            },
            "filter": {
              "common_grams": {
                  "type": "common_grams",
                  "common_words": ["a", "is", "the"]
              }
            }
        }
    }
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/905.console"></div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analysis-common-grams-tokenfilter-configure-parms"></a>Configurable parameters<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenfilters/common-grams-tokenfilter.asciidoc">edit</a>
</h3>
</div></div></div>
<div class="variablelist">
<dl class="variablelist">
<dt>
<span class="term">
<code class="literal">common_words</code>
</span>
</dt>
<dd>
<p>(Required*, array of strings)
A list of tokens. The filter generates bigrams for these tokens.</p>
<p>Either this or the <code class="literal">common_words_path</code> parameter is required.</p>
</dd>
<dt>
<span class="term">
<code class="literal">common_words_path</code>
</span>
</dt>
<dd>
<p>(Required*, string)
Path to a file containing a list of tokens. The filter generates bigrams for
these tokens.</p>
<p>This path must be absolute or relative to the <code class="literal">config</code> location. The file must
be UTF-8 encoded. Each token in the file must be separated by a line break.</p>
<p>Either this or the <code class="literal">common_words</code> parameter is required.</p>
</dd>
<dt>
<span class="term">
<code class="literal">ignore_case</code>
</span>
</dt>
<dd>
(Optional, boolean)
If <code class="literal">true</code>, matches for common words matching are case-insensitive.
Defaults to <code class="literal">false</code>.
</dd>
<dt>
<span class="term">
<code class="literal">query_mode</code>
</span>
</dt>
<dd>
<p>(Optional, boolean)
If <code class="literal">true</code>, the filter excludes the following tokens from the output:</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
Unigrams for common words
</li>
<li class="listitem">
Unigrams for terms followed by common words
</li>
</ul>
</div>
<p>Defaults to <code class="literal">false</code>. We recommend enabling this parameter for
<a class="xref" href="search-analyzer.html" title="search_analyzer">search analyzers</a>.</p>
<p>For example, you can enable this parameter and specify <code class="literal">is</code> and <code class="literal">the</code> as
common words. This filter converts the tokens <code class="literal">[the, quick, fox, is, brown]</code> to
<code class="literal">[the_quick, quick, fox_is, is_brown,]</code>.</p>
</dd>
</dl>
</div>
</div>

<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="analysis-common-grams-tokenfilter-customize"></a>Customize<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch/edit/7.7/docs/reference/analysis/tokenfilters/common-grams-tokenfilter.asciidoc">edit</a>
</h3>
</div></div></div>
<p>To customize the <code class="literal">common_grams</code> filter, duplicate it to create the basis
for a new custom token filter. You can modify the filter using its configurable
parameters.</p>
<p>For example, the following request creates a custom <code class="literal">common_grams</code> filter with
<code class="literal">ignore_case</code> and <code class="literal">query_mode</code> set to <code class="literal">true</code>:</p>
<div class="pre_wrapper lang-console">
<pre class="programlisting prettyprint lang-console">PUT /common_grams_example
{
    "settings": {
        "analysis": {
            "analyzer": {
              "index_grams": {
                  "tokenizer": "whitespace",
                  "filter": ["common_grams_query"]
              }
            },
            "filter": {
              "common_grams_query": {
                  "type": "common_grams",
                  "common_words": ["a", "is", "the"],
                  "ignore_case": true,
                  "query_mode": true
              }
            }
        }
    }
}</pre>
</div>
<div class="console_widget" data-snippet="snippets/906.console"></div>
</div>

</div>
<div class="navfooter">
<span class="prev">
<a href="analysis-classic-tokenfilter.html">« Classic token filter</a>
</span>
<span class="next">
<a href="analysis-condition-tokenfilter.html">Conditional token filter »</a>
</span>
</div>
</div>

                  <!-- end body -->
                        </div>
                        <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                        
                        </div>
                    </div>
                </div>
            </section>
        </div>
    </section>
</div>
<script src="../static/cn.js"></script>
</body>
</html>